SIGNIFICANCE AND EXPLANATION

The complexity or work required for a numerical procedure is sometimes measured in terms of the number of function evaluations required to yield an error of a given size. There are several effective algorithms which are based on a sequential method of selecting function evaluations. Binary search for a root of a function and Fibonacci search for the maximum of a unimodal function are examples of sequential procedures.

In a sequential procedure the prior function values are used to pick an optimum location to evaluate the function next. An alternative to sequential selection is to choose all the evaluation points at once, however many it has been decided to use. Such an approach is simpler but typically yields a slower rate of convergence. For instance, the error for binary search decreases geometrically in the number of function values, while a preassigned strategy yields only an error inversely proportional to the number of function values.

It is the purpose of this paper to identify a class of estimation problems in which sequentiality will not yield a faster rate of convergence than a deterministic choice. This class includes problems of numerical quadrature and differentiation. Thus we demonstrate that the complicated procedure of sequential estimation can be replaced by a much simpler strategy of a deterministic choice for some important problems of numerical calculation.


The  responsibility  for  the  wording  and  views  expressed  in  this  descriptive 
summary  lies  with  MRC,  and  not  with  the  authors  of  this  report. 


OPTIMAL  SEQUENTIAL  AND  NON-SEQUENTIAL 
PROCEDURES  FOR  EVALUATING  A FUNCTIONAL 

Shmuel  Gal  and  Charles  A.  Micchelli 

1. Framework and Definitions.

In this paper we compare certain optimal procedures for sampling a function f belonging to some prescribed class for the purpose of estimating a functional Uf. Our main objective is to identify a large collection of examples in which sequential procedures are not advantageous. Several unsettled questions whose solution would illuminate the problem studied here are outlined at the end of Section 3.

We begin with a family of real valued functions defined on [0,1], $F = \{f\}$, and a functional $Uf$ defined for all $f \in F$. Assume that for any $f \in F$ we can make n observations of f at points $x_1, \ldots, x_n$ and obtain the information $f(x_1) = y_1, \ldots, f(x_n) = y_n$. The set of uncertainty in $Uf$ is

$\rho(x;f) = \{U\varphi : \varphi \in F,\ \varphi(x_i) = f(x_i),\ i = 1, \ldots, n\}, \qquad x = (x_1, \ldots, x_n).$

(For simplicity we shall consider the case of $x_i \in [0,1]$, but all the results hold for functions defined on any bounded subset of $R^m$.) As a measure of the size of this set we take

(1)  $g(x;f) = g(x_1, \ldots, x_n; f) = \sup_{\varphi \in W(x;f)} U\varphi - \inf_{\varphi \in W(x;f)} U\varphi$

where $W(x;f) = \{\varphi : \varphi \in F,\ \varphi(x_i) = f(x_i)\}$, i.e., g is the length of the smallest interval containing $\rho(x;f)$. The function $g(x;f)$ is defined for $x \in C_n = \{x = (x_1, \ldots, x_n) : 0 \le x_i \le 1\}$ and $f \in F$. We will compare three policies for choosing $x_1, \ldots, x_n$.

To this end, let E be some prescribed subset of [0,1] from which we will sample a function $f \in F$. In practice, we are typically constrained to sample from some fixed finite subset of [0,1]. We will frequently assume E has this property.

Now, let $E_n = E \times E \times \cdots \times E$ (n times) and consider the following three procedures.
a. Deterministic.

The accuracy guaranteed is defined as

(2)  $d_n = \inf_{x \in E_n} \sup_{f \in F} g(x;f).$

b. Random.

Random observations are preassigned by a probability distribution $\mu$ on $E_n$. The accuracy guaranteed here is

(3)  $r_n = \inf_{\mu} \sup_{f \in F} \int_{E_n} g(x;f)\, d\mu(x).$

(We will only allow probability measures $d\mu$ on $E_n$ for which $g(x;f)$ is $\mu$-integrable.)


c. Sequential.

A sequential search procedure is a set of n functions $h_1, h_2, \ldots, h_n$ where $h_1 = x_1$ is a constant, $x_2 = h_2(x_1, f(x_1))$, ..., $x_{i+1} = h_{i+1}(x_1, \ldots, x_i, f(x_1), \ldots, f(x_i))$, etc. This procedure produces an $h(x;f) \in E_n$, $h = (h_1, \ldots, h_n)$. The totality of sequential procedures will be denoted by $S_n$.

The accuracy guaranteed by a sequential procedure is given by

(4)  $s_n = \inf_{h \in S_n} \sup_{f \in F} g(h(x;f);f).$

Remark 1. Obviously $r_n \le d_n$ and $s_n \le d_n$ for any subset E of [0,1]. The following example is a simple instance in which $s_n \ll r_n$.

Let E = [0,1], $F = \{f : f(0) = -1,\ f(1) = +1,\ f(a) > 0 \Rightarrow f(b) > 0 \text{ for } a < b \le 1\}$ and $Uf = \inf\{z : f(z) > 0\}$ (i.e. Uf is the "root" of f). If we let $x_0 = 0$, $x_{n+1} = 1$ then $g(x;f) = x_j - x_{j-1}$ where j is the smallest index such that $y_j = f(x_j) > 0$. Hence

$\inf_x \sup_f g(x;f) = \inf_x \sup_i (x_i - x_{i-1}) = \frac{1}{n+1}.$

To estimate $r_n$ we define $k_z \in F$ by $k_z(t) = -1$ for $0 \le t \le z$ and $k_z(t) = +1$ for $z < t \le 1$. Then

$r_n = \inf_{\mu} \sup_f \int_{E_n} g(x;f)\, d\mu(x)$

$\ge \inf_{\mu} \int_0^1 \int_{E_n} g(x;k_z)\, d\mu(x)\, dz$

$= \inf_{\mu} \int_{E_n} \sum_{i=1}^{n+1} \int_{x_{i-1}}^{x_i} (x_i - x_{i-1})\, dz\, d\mu(x)$

$= \inf_{\mu} \int_{E_n} \sum_{i=1}^{n+1} (x_i - x_{i-1})^2\, d\mu(x) \ge \frac{1}{n+1}.$

On the other hand, the bisection (sequential) procedure yields an accuracy of $(1/2)^n$ and in fact it is easily seen that $s_n = (1/2)^n$.
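The gap between the two rates in this example can be checked numerically. The following is a minimal sketch, not part of the original report: the helper names `step`, `grid_uncertainty` and `bisection_uncertainty` are illustrative assumptions, and the preassigned design is taken to be the equally spaced grid $x_i = i/(n+1)$.

```python
def step(z):
    """k_z from the text: -1 on [0, z], +1 on (z, 1], so its "root" is z."""
    return lambda t: -1.0 if t <= z else 1.0


def grid_uncertainty(f, n):
    """Bracket length x_j - x_{j-1} after n equally spaced (preassigned) samples."""
    pts = [i / (n + 1) for i in range(n + 2)]  # x_0 = 0 and x_{n+1} = 1 are not sampled
    for j in range(1, n + 2):
        # f(1) = +1 is known in advance, so the last point needs no evaluation
        if j == n + 1 or f(pts[j]) > 0:
            return pts[j] - pts[j - 1]


def bisection_uncertainty(f, n):
    """Bracket length after n sequential (bisection) samples."""
    lo, hi = 0.0, 1.0
    for _ in range(n):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            hi = mid  # the root is at or to the left of mid
        else:
            lo = mid  # the root is to the right of mid
    return hi - lo


if __name__ == "__main__":
    n = 10
    f = step(0.3141)
    print(grid_uncertainty(f, n))       # close to 1/(n+1)
    print(bisection_uncertainty(f, n))  # equals (1/2)**n
```

With the same budget of n samples, the grid leaves a bracket of length about 1/(n+1), while bisection halves the bracket at every step.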

In contrast to this situation we will show below that there is a wide class of problems for which $r_n \le s_n$.

2.  Linear  Functionals  Defined  On  Convex  Sets. 

In  this  section  we  assume  that 

(5)  F is  a convex  family 
and 

(6)  U is  a linear  functional  . 

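Lemma 1. Suppose (5) and (6) hold. Then $g(x;f)$ is a concave function of $f \in F$.

Proof. Let $f_1, f_2 \in F$ and $0 \le \theta \le 1$. Then for any $\varepsilon > 0$ there exist $\bar f_i, \underline f_i \in F$ with $\bar f_i(x_j) = \underline f_i(x_j) = f_i(x_j)$, $i = 1, 2$, $j = 1, \ldots, n$, and

(7)  $g(x;f_i) \le U(\bar f_i) - U(\underline f_i) + \varepsilon.$

Thus by (7) and the linearity of U

$g(x; \theta f_1 + (1-\theta) f_2) \ge U(\theta \bar f_1 + (1-\theta) \bar f_2) - U(\theta \underline f_1 + (1-\theta) \underline f_2)$

$= \theta [U(\bar f_1) - U(\underline f_1)] + (1-\theta) [U(\bar f_2) - U(\underline f_2)]$

$\ge \theta g(x;f_1) + (1-\theta) g(x;f_2) - \varepsilon.$

Letting $\varepsilon \to 0$ proves the lemma.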
Theorem 1. If (5) and (6) hold and E is a finite subset of [0,1] then

$r_n = \sup_{f \in F} \inf_{\mu} \int_{E_n} g(x;f)\, d\mu(x)$  (see (3)).


This theorem is a consequence of the following version of Ky Fan's minimax theorem [2], which we state below.

Theorem A. Let S, T be sets with S a compact Hausdorff space. Let $\Phi$ be a real-valued function defined on $S \times T$ which is continuous and convex with respect to s, and concave in t. Then there exists an $s_0 \in S$ such that

$\min_{s \in S} \sup_{t \in T} \Phi(s,t) = \sup_{t \in T} \Phi(s_0,t) = \sup_{t \in T} \min_{s \in S} \Phi(s,t).$

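For the proof of Theorem 1 we define

$\Phi(\mu,f) = \int_{E_n} g(x;f)\, d\mu(x)$

where $\mu$ ranges over all probability measures on $E_n$. Since $E_n$ is finite, the set $\{\mu\}$ with its usual topology is a compact finite-dimensional simplex; $\Phi$ is linear (hence continuous and convex) in $\mu$ and, by Lemma 1, concave in f. The theorem therefore follows from Theorem A and the definition of $r_n$.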

This  result  leads  us  to 

Corollary 1. If (5) and (6) are satisfied and E is a finite subset of [0,1] then

$d_n \ge s_n \ge r_n$  (Sequentiality does not help!).

Proof. Given $\varepsilon > 0$ there is an $f \in F$ with

$\int_{E_n} g(x;f)\, d\mu(x) \ge r_n - \varepsilon$

for all probability measures $\mu$. In particular, $g(x;f) \ge r_n - \varepsilon$ for all $x \in E_n$. Thus, for any sequential search procedure $h = h(x;f)$ we have $g(h(x;f);f) \ge r_n - \varepsilon$. Consequently,

$s_n = \inf_{h \in S_n} \sup_{f \in F} g(h(x;f);f) \ge r_n - \varepsilon.$

Since $\varepsilon$ is arbitrary, $s_n \ge r_n$; the inequality $d_n \ge s_n$ was noted in Remark 1.

Remark 2. A randomized sequential procedure combines, in an obvious way, the features of a random and a sequential search. Using the same argument, it follows directly that even randomized sequential procedures do not produce better accuracy than the preassigned randomized procedures considered in this paper.


We show below with an example that Theorem 1 is not valid in general for infinite subsets E of [0,1].

Example 1. Let E = [0,1], $F = \{f : \text{there is a } z,\ 0 < z < 1, \text{ such that } f \text{ is strictly increasing in } [0,z] \text{ and } f(t) = f(z),\ t \in (z,1), \text{ and } |f(t)| \le 1,\ t \in [0,1]\}$ and $Uf = \lim_{t \to 1^-} f(t)$. Note that F is convex and U is linear, but E is not a finite set. Now, for n = 2 we have $\inf_{x_1, x_2} g(x_1, x_2; f) = 0$ and thus

$\sup_f \inf_{\mu} \int g(x;f)\, d\mu(x) = 0.$

However, for any $\mu$ there exists a q, 0 < q < 1, such that $\mu(x_1 \text{ and } x_2 \text{ are not both in } [q,1)) \ge 1/2$. Hence for $\varepsilon$ small and

$f_\varepsilon(t) = -1 + \varepsilon t$ for $0 \le t \le q$,  $f_\varepsilon(t) = -1 + \varepsilon q$ for $q < t \le 1$,

it follows that if one of the $x_i$'s is not in [q,1) then $g(x_1, x_2; f_\varepsilon) \ge 2 - \varepsilon$. Thus

$\int g(x;f_\varepsilon)\, d\mu \ge 1 - \varepsilon/2$

and

$\inf_{\mu} \sup_f \int g(x;f)\, d\mu \ge 1 - \varepsilon/2.$


The fact that the functions in F need not be continuous at one may seem artificial. Nevertheless, it is possible to construct other examples of this type in which all the functions in F are continuous.

Remark 3. If we allow nature (our opponent) to choose an f from some fixed finite subset $F^*$ of F by means of a probability distribution $d\nu$ then the accuracy we can obtain with our best choice of $d\mu$ is

$\min_{\mu} \max_{\nu} \int_{E_n} \int_{F^*} g(x;f)\, d\mu(x)\, d\nu(f).$

By the standard version of the minimax theorem this number equals

$\max_{\nu} \min_{\mu} \int_{E_n} \int_{F^*} g(x;f)\, d\mu(x)\, d\nu(f)$

and optimal strategies $d\mu^*(x)$, $d\nu^*(f)$ exist even if (5) or (6) is not satisfied. In this case even though nature will choose the data according to the distribution $d\nu^*$ we know by our previous example that a sequential procedure can guarantee a much smaller accuracy. Why is this so? The explanation of this phenomenon, which is contained in Corollary 1, is as follows: When (5) and (6) are in force then nature has a universal worst function $f^*$ which can be used against any sequential procedure for selecting $x = (x_1, \ldots, x_n)$. Thus we cannot hope to "learn" about $f \in F$ by using sequentiality. On the other hand, if nature has to choose among a set of functions (randomized) then we can really "learn" something about the function which was chosen. This was the case in the method of bisection.

Remark 4. Let $d\mu^*$ be the optimal probability distribution for choosing $x = (x_1, \ldots, x_n)$ and suppose there is an optimal strategy of nature $f^*$ among all $f \in F$. Then

$r_n = \inf_{x \in E_n} g(x;f^*) = \int_{E_n} g(x;f^*)\, d\mu^*(x)$

and thus the support of $\mu^*$ is a subset of all $x \in E_n$ which satisfy $g(x;f^*) = \inf_x g(x;f^*)$. This condition may help to find the optimal search strategy, i.e., our plan against the data corresponding to the worst function.

Below  we  offer  some  further  examples  which  show  that  certain  extensions  of  Theorem  1 are 
not  possible. 

Example 2. Here we define a convex set F and a nonlinear convex functional U and demonstrate that Theorem 1 is not valid for this case.

Let E = [0,1],

$F = \{f : f \text{ concave},\ |f'(x)| \le 1,\ x \in [0,1]\}$

and

$Uf = \sup_{0 \le x \le 1} f(x).$

U is obviously a convex functional and F a convex set.

For $0 \le z \le 1$ define $\varphi_z \in F$ as $\varphi_z(x) = x$ for $0 \le x \le z$ and $\varphi_z(x) = 2z - x$ for $z \le x \le 1$. Thus for any n observations $x_1, \ldots, x_n$ (which we assume for convenience are ordered, $0 = x_0 < x_1 < \cdots < x_n < x_{n+1} = 1$), if $x_j \le z < x_{j+1}$ then

$g(x;\varphi_z) = z - \max\{\varphi_z(x_j), \varphi_z(x_{j+1})\} = \min\{z - x_j,\ x_{j+1} - z\}.$

Therefore

$\int_0^1 g(x_1, \ldots, x_n; \varphi_z)\, dz = \int_0^{x_1} (x_1 - z)\, dz + \int_{x_1}^{(x_1+x_2)/2} (z - x_1)\, dz + \int_{(x_1+x_2)/2}^{x_2} (x_2 - z)\, dz + \cdots + \int_{x_n}^{1} (z - x_n)\, dz$

$= \frac{x_1^2}{2} + \frac{(x_2 - x_1)^2}{4} + \cdots + \frac{(x_n - x_{n-1})^2}{4} + \frac{(1 - x_n)^2}{2}$

and, since the supremum over f is at least this average over z, it follows that $r_n, d_n \ge \frac{1}{4n}$.

On the other hand, since all $f \in F$ are unimodal we can achieve with a Fibonacci search an accuracy of about $\alpha^n$, $\alpha \approx .62$ (the Golden section) for the location of the maximum. Moreover, since $|f'| \le 1$ we may then locate the value of the maximum Uf within $\alpha^n$ as well. Thus $s_n \le (.62)^n \ll r_n, d_n$. A completely analogous example can be constructed with a concave functional.
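The sequential rate quoted above can be illustrated with a golden-section search, a close relative of Fibonacci search that is easier to write down. This is a sketch under stated assumptions rather than the authors' procedure: the names `tent` and `golden_section_max` are hypothetical, and golden-section search is swapped in for Fibonacci search because it attains essentially the same ratio $\alpha \approx .62$ per evaluation.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2  # golden-section ratio, about 0.618


def tent(z):
    """phi_z from Example 2: slope +1 up to z, slope -1 afterwards."""
    return lambda x: x if x <= z else 2 * z - x


def golden_section_max(f, n, lo=0.0, hi=1.0):
    """Bracket the maximiser of a unimodal f on [lo, hi] using n >= 2 evaluations."""
    a = hi - ALPHA * (hi - lo)
    b = lo + ALPHA * (hi - lo)
    fa, fb = f(a), f(b)
    for _ in range(n - 2):
        if fa < fb:
            # maximiser lies in [a, hi]; the old b becomes the new left probe
            lo, a, fa = a, b, fb
            b = lo + ALPHA * (hi - lo)
            fb = f(b)
        else:
            # maximiser lies in [lo, b]; the old a becomes the new right probe
            hi, b, fb = b, a, fa
            a = hi - ALPHA * (hi - lo)
            fa = f(a)
    # final narrowing with the two values already in hand
    if fa < fb:
        lo = a
    else:
        hi = b
    return lo, hi


if __name__ == "__main__":
    n = 15
    lo, hi = golden_section_max(tent(0.37), n)
    print(hi - lo, ALPHA ** (n - 1), 1 / (4 * n))
```

After n evaluations the bracket has length about $\alpha^{n-1}$, which for moderate n is already far below the lower bound 1/(4n) that holds for every preassigned or randomized design. Because $|f'| \le 1$, the largest sampled value differs from Uf by at most the bracket length.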

Example 3. In this example E = [0,1], $Uf = \int_0^1 f(x)\, dx$ and $F = \{f_z : 0 \le z \le 1\}$, $f_z$ = characteristic function of [z,1]. Here U is a linear functional and F is a nonconvex set.

For $z \in (x_i, x_{i+1}]$, we have $g(x;f_z) = x_{i+1} - x_i$ and so

$\int_0^1 g(x_1, \ldots, x_n; f_z)\, dz = \sum_{i=1}^{n+1} (x_i - x_{i-1})^2 \ge \frac{1}{n+1}.$

Thus $r_n \ge 1/(n+1)$. But, if we use the bisection sequential procedure then we may locate z with error $(1/2)^n$ and so the interval of uncertainty about Uf is $s_n = (1/2)^n \ll r_n$.

Examples 2 and 3 remain valid if we take E to be a finite (but large) set of points (e.g., $E = \{i/m,\ i = 0, 1, \ldots, m\}$).

3.  Centered  Sets. 

We will say a set F is centered (about $f_c$) if there exists an $f_c \in F$ such that whenever $f \in F$ then $2f_c - f \in F$.

Lemma 2. Suppose (5) and (6) hold and F is centered about $f_c$. Then for all $x = (x_1, \ldots, x_n) \in E_n$ and $f \in F$

$g(x;f) \le g(x;f_c).$

Proof. Since $f \in F$ if and only if $2f_c - f \in F$,

$W(x; 2f_c - f) = 2f_c - W(x;f)$  (see (1)).

Hence we see that $g(x; 2f_c - f) = g(x;f)$ for all $f \in F$. Now, it is an easy matter, in view of Lemma 1, to prove the lemma:

$g(x;f) = \frac{1}{2}\left(g(x;f) + g(x; 2f_c - f)\right) \le g(x;f_c).$


Theorem 2. If there exists an $f_c \in F$ such that for all $x \in E_n$ and $f \in F$

(9)  $g(x;f) \le g(x;f_c)$

then for all n

(10)  $d_n = r_n = s_n$

and

(11)  $f_c$ is the optimal strategy of nature independently of n.


Proof: Let $v = \inf_{x \in E_n} g(x;f_c)$. It is obvious that by using $f_c$, nature can keep the payoff to be at least v against any randomized or sequential procedure. On the other hand for any $\varepsilon > 0$ it is possible to find an $x_\varepsilon$ which satisfies $g(x_\varepsilon;f_c) < v + \varepsilon$, so that for any $f \in F$

$g(x_\varepsilon;f) \le g(x_\varepsilon;f_c) < v + \varepsilon.$

Thus, the deterministic search procedure $x_\varepsilon$ keeps the payoff below $v + \varepsilon$. It follows that $f_c$ is optimal and $x_\varepsilon$ is $\varepsilon$-optimal.


Remark 5. Note that for Theorem 2 to hold, it is sufficient that condition (9) is satisfied and no other assumptions (such as linearity of U, convexity of F or finiteness of E) are necessary.

The proof of Theorem 2 actually presents a method for finding the optimal search procedure. The rule in this case is simple: we just have to find a set of points $x_1^*, \ldots, x_n^*$ which minimizes the interval of uncertainty for the case in which nature uses $f_c$.

We also note that a related result under a stronger hypothesis appears in Bakhvalov [1].

Needless  to  say  there  are  many  interesting  examples  to  which  Theorem  2 applies.  Below 
we  mention  one  from  the  theory  of  optimal  quadrature. 

Example 4. Let $F = \{f : f \text{ is absolutely continuous and } |f(z) - f(y)| \le M|z - y|,\ 0 \le z, y \le 1\}$ and $Uf = \int_0^1 f(t)\, dt$. Then the assumptions of Theorem 2 hold (since 0 is the center of F) and so the observation points $x_1^*, \ldots, x_n^*$ are deterministic and preassigned and have to satisfy

$g(x_1^*, \ldots, x_n^*; 0) = \min_{0 \le x_1 \le \cdots \le x_n \le 1} g(x_1, \ldots, x_n; 0).$

It easily follows that

$g(x;0) = 2M\left[\frac{1}{2} x_1^2 + \frac{1}{4} \sum_{i=1}^{n-1} (x_{i+1} - x_i)^2 + \frac{1}{2} (1 - x_n)^2\right].$

This function has a unique minimum for $x_i^* = \frac{i - 1/2}{n}$, $i = 1, \ldots, n$, and we conclude that

$r_n = d_n = s_n = M/2n.$
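The closed form for $g(x;0)$ is easy to evaluate, which gives a quick check of the value M/2n. A minimal sketch, with the function names `g_lipschitz` and `midpoint_nodes` chosen here for illustration only:

```python
def g_lipschitz(x, M=1.0):
    """g(x; 0) from Example 4 for sorted nodes x in [0, 1] and Lipschitz constant M."""
    inner = sum((b - a) ** 2 for a, b in zip(x, x[1:])) / 4
    return 2 * M * (x[0] ** 2 / 2 + inner + (1 - x[-1]) ** 2 / 2)


def midpoint_nodes(n):
    """The nodes x_i = (i - 1/2) / n, i = 1, ..., n."""
    return [(2 * i - 1) / (2 * n) for i in range(1, n + 1)]


if __name__ == "__main__":
    n, M = 5, 3.0
    print(g_lipschitz(midpoint_nodes(n), M))                        # M / (2n)
    print(g_lipschitz([i / (n + 1) for i in range(1, n + 1)], M))   # larger
```

The nodes $x_i = (i - 1/2)/n$ are those of the composite midpoint rule; the equally spaced alternative $x_i = i/(n+1)$ used for comparison gives a strictly larger uncertainty.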

1 

In this example we may replace Uf by any positive linear functional $Uf = \int_0^1 f(t)\, d\gamma(t)$ and F by $\{f : f \text{ abs. cont.},\ |f'(t)| \le b(t) \text{ a.e.}\}$. Again $r_n = d_n = s_n$ but in general the optimal $x_i^*$ are not equally spaced and difficult to find explicitly. They correspond to the minimum of

$g(x;0) = 2\left[\int_0^{x_1} (\bar b(x_1) - \bar b(t))\, d\gamma(t) + \sum_{i=1}^{n-1} \left(\int_{x_i}^{z_i} (\bar b(t) - \bar b(x_i))\, d\gamma(t) + \int_{z_i}^{x_{i+1}} (\bar b(x_{i+1}) - \bar b(t))\, d\gamma(t)\right) + \int_{x_n}^{1} (\bar b(t) - \bar b(x_n))\, d\gamma(t)\right]$

where

$\bar b(t) = \int_0^t b(s)\, ds$

and $z_i$ satisfies

$\bar b(z_i) - \bar b(x_i) = \frac{1}{2}\left(\bar b(x_{i+1}) - \bar b(x_i)\right).$


In our next example we observe that the conclusion of Theorem 2 may remain valid in the absence of a center for F. This example raises the issue of the extent to which the requirement in Theorem 2 that F have a center can be weakened.

Example 5. Let $F = \{f : f \text{ increasing},\ 0 \le f(x) \le 1 \text{ for } 0 \le x \le 1\}$ and $Uf = \int_0^1 f(x)\, dx$.

Then (5) and (6) hold, however, F does not have a center. The following remarks are a formal proof of the latter statement. Assume that a center $f_c$ exists. Since $0 \in F$ then $2f_c - 0 = 2f_c \in F$ so that $f_c \le 1/2$. But $1 \in F$ which implies $f_c \ge 1/2$. Thus $f_c \equiv 1/2$, which is impossible because then $f(x) = x \in F$ implies that $2f_c(x) - f(x) = 1 - x$ is increasing, which it is not.

Nevertheless we will show that $f^*(x) = x$ is a universally worst function and $d_n = r_n$.

It is easily verified that for any $x = (x_1, \ldots, x_n)$, $0 \le x_1 \le \cdots \le x_n \le 1$, and $f^*(x) = x$

$g(x;f^*) = x_1^2 + (x_2 - x_1)^2 + \cdots + (x_n - x_{n-1})^2 + (1 - x_n)^2 \ge \frac{1}{n+1},$

that is, nature can guarantee a value of $\frac{1}{n+1}$ by choosing $f^*$. On the other hand, if we use $x_i^* = \frac{i}{n+1}$ then for any $f \in F$

$g(x^*;f) = \frac{1}{n+1}\left[f\left(\tfrac{1}{n+1}\right) + \left(f\left(\tfrac{2}{n+1}\right) - f\left(\tfrac{1}{n+1}\right)\right) + \cdots + \left(f\left(\tfrac{n}{n+1}\right) - f\left(\tfrac{n-1}{n+1}\right)\right) + \left(1 - f\left(\tfrac{n}{n+1}\right)\right)\right] = \frac{1}{n+1}.$

Thus the searcher can guarantee a value of $\frac{1}{n+1}$ by a deterministic strategy so that $d_n = r_n = s_n$. In addition, we know $f^*$ is a universally worst function for all n. This example has been mentioned by Kiefer in [3].
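Both halves of this argument can be checked numerically. The sketch below assumes the formula $g(x;f) = \sum_{i=0}^{n} (x_{i+1} - x_i)(f(x_{i+1}) - f(x_i))$ with the conventions $x_0 = 0$, $x_{n+1} = 1$, $f(x_0) := 0$ and $f(x_{n+1}) := 1$; this is not written out in the text but follows from bounding an increasing function between consecutive samples. The name `g_monotone` is hypothetical.

```python
def g_monotone(x, fx):
    """Uncertainty in the integral of an increasing f, 0 <= f <= 1, from samples fx at sorted x.

    Between consecutive nodes f is squeezed between the two sampled values;
    outside the nodes only the bounds 0 and 1 are known.
    """
    xs = [0.0] + list(x) + [1.0]
    ys = [0.0] + list(fx) + [1.0]
    return sum((xs[i + 1] - xs[i]) * (ys[i + 1] - ys[i]) for i in range(len(xs) - 1))


if __name__ == "__main__":
    n = 6
    equal = [i / (n + 1) for i in range(1, n + 1)]
    # Equally spaced nodes: the uncertainty is 1/(n+1) for every increasing f.
    print(g_monotone(equal, [t ** 3 for t in equal]))
    # Worst function f*(x) = x against arbitrary nodes: at least 1/(n+1).
    nodes = [0.1, 0.5, 0.55]
    print(g_monotone(nodes, nodes))
```

For equally spaced nodes the sum telescopes to 1/(n+1) whatever f is, while for $f^*(x) = x$ the sum of squared gaps is at least 1/(n+1) whatever the nodes are.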


Our last example has the property that (5) and (6) hold and $r_n < d_n$.


Example 6. For any interval $I \subset [0,1]$ we let $f_I$ be its characteristic function. Let $I_1, \ldots, I_m$ be a partition of [0,1], $I_j \cap I_k = \emptyset$, $j \ne k$, $\bigcup_{i=1}^{m} I_i = [0,1]$, with $m \ge 3$, and define

$F = \left\{\sum_{i=1}^{m} y_i f_{I_i} : 0 \le y_i,\ \sum_{i=1}^{m} y_i = 1\right\}$

(step functions). For U we choose an arbitrary linear functional given by

$Uf = \sum_{j=1}^{m} a_j y_j$

and as usual E = [0,1]. We define a mapping $i : [0,1] \to \{1, \ldots, m\}$ by the condition that $t \in I_{i(t)}$ and set $I(x) = \{i(x_1), \ldots, i(x_n)\}$, $x = (x_1, \ldots, x_n)$. Then it is easily seen that

$g(x;f) = \left(1 - \sum_{j \in I(x)} y_j\right)\left(\max_{j \notin I(x)} a_j - \min_{j \notin I(x)} a_j\right) = \sum_{j \notin I(x)} y_j \left(\max_{j \notin I(x)} a_j - \min_{j \notin I(x)} a_j\right).$

Hence for $a_j = j$, $j = 1, \ldots, m$, we have $d_1 = m - 2$ while for the randomized procedure defined by

$\Pr\{x_1 \in I_j\} = \frac{m-1}{m^2 - 2m + 2}$ for $j = 1, m$,  $\Pr\{x_1 \in I_j\} = \frac{m-2}{m^2 - 2m + 2}$ for $1 < j < m$,

it follows that for all $f \in F$

$\int_0^1 g(x_1;f)\, d\mu(x_1) = \frac{m-1}{m^2 - 2m + 2}\left[(1 - y_1) + (1 - y_m)\right](m-2) + \sum_{j=2}^{m-1} \frac{m-2}{m^2 - 2m + 2} (1 - y_j)(m-1)$

$= \frac{m^2 - 2m + 1}{m^2 - 2m + 2}\,(m-2) < m - 2.$

Thus $r_1 < d_1$.
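The single-observation computation in Example 6 can be reproduced exactly with rational arithmetic. This is an illustrative sketch: the function names are assumptions, cells are indexed from 0 in the code, and the coefficients are fixed to $a_j = j$ as in the example.

```python
from fractions import Fraction


def spread_without(j, a):
    """max - min of the coefficients a_k over the unobserved cells k != j."""
    rest = a[:j] + a[j + 1:]
    return max(rest) - min(rest)


def g_one_sample(j, y, a):
    """g(x; f) for one observation falling in cell j, where f = sum_k y_k f_{I_k}."""
    return (1 - y[j]) * spread_without(j, a)


def deterministic_value(m):
    """d_1: best cell to observe against the worst f (attained at y_j = 0)."""
    a = list(range(1, m + 1))
    return min(spread_without(j, a) for j in range(m))


def randomized_value(y, m):
    """Expected g under the weights of Example 6 (end cells m-1, interior cells m-2)."""
    a = list(range(1, m + 1))
    denom = m * m - 2 * m + 2
    p = [Fraction(m - 1 if j in (0, m - 1) else m - 2, denom) for j in range(m)]
    return sum(p[j] * g_one_sample(j, y, a) for j in range(m))


if __name__ == "__main__":
    m = 5
    y = [Fraction(1, m)] * m
    print(deterministic_value(m), randomized_value(y, m))
```

The randomized value does not depend on y, because each cell contributes the same product of weight and spread and the terms $1 - y_j$ always sum to m - 1.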


All our results have in effect compared $d_n$, $r_n$, $s_n$ for a fixed n. It would be useful to determine under conditions (5) and (6) whether these quantities can be asymptotically different, in other words when is $\limsup_{n \to \infty} \frac{d_n}{r_n} < \infty$.




The ideas we have presented have wider applicability. In the next section we will comment on the optimal estimation of operators from sampling in the presence of noise.


4.  Extensions. 


In applications it is frequently unrealistic to assume that the data $f(x_1), \ldots, f(x_n)$ is known exactly. Function values are usually only inaccurately determined as a result of either experimental or computational error. We measure these errors with a norm $\|\cdot\|$ on $R^n$ and say that $f(x_i) = y_i + e_i$ where $\|e\| \le 1$ (normalized), $e = (e_1, \ldots, e_n)$. In this case our uncertainty in U is the set

$\{U\varphi : \varphi \in F,\ \varphi(x_i) = f(x_i) + e_i,\ \|e\| \le 1\}$

and the corresponding g is

$g_e(x;f) = \sup_{\varphi \in W_e(x;f)} U\varphi - \inf_{\varphi \in W_e(x;f)} U\varphi,$

$W_e(x;f) = \{\varphi : \varphi \in F,\ \varphi(x_i) = f(x_i) + e_i,\ \|e\| \le 1\}.$

Continuing with this analogy we introduce $d_n^e$, $r_n^e$, $s_n^e$, as in Section 1, and it is easily seen that all the results of Section 2 remain valid for inaccurate data.

Our discussion in Sections 2 and 3 also lends itself to the estimation of operators between linear spaces. Thus in this case U is an arbitrary mapping from a linear space X into another linear space Z. Our class F is now some subset K of X and the information about f which may be used to estimate Uf is denoted by If. Earlier If was a sample of function values $(f(x_1), \ldots, f(x_n))$. Now we allow I to be any linear mapping from X into $R^n$, $If = (I_1 f, \ldots, I_n f)$, $I_i : X \to R$, i.e. n linear observations of f. As before we will measure the error in the observation If with a norm $\|\cdot\|$ on $R^n$ and thus the set of uncertainty is

$Q(f;I) = \{U\varphi : I\varphi = If + e,\ \|e\| \le 1,\ \varphi \in K\}.$

In this case, Q is not an interval but rather a (quite arbitrary) subset of Z. Therefore we require some measure of the size of Q. For this purpose, we assume that Z is a normed linear space and recall that the Chebyshev radius of a set $T \subset Z$ is defined as

$r(T) = \inf_{x \in Z} \sup_{y \in T} \|x - y\|.$


The radius of a set serves as an effective measure of its size and we let

$g(I;f) = g(I_1, \ldots, I_n; f) = r(Q(f;I)).$

This definition differs by a factor of 1/2 from the one we used earlier in (1) when Z = R.
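The factor of 1/2 is visible already for a finite set of reals, where the best center is the midpoint of the smallest enclosing interval. A small sketch, with hypothetical helper names:

```python
def worst_distance(c, T):
    """sup over y in T of |c - y| for a candidate center c."""
    return max(abs(c - y) for y in T)


def chebyshev_radius_1d(T):
    """Chebyshev radius of a finite set T of reals: half the length of its hull."""
    return (max(T) - min(T)) / 2


if __name__ == "__main__":
    T = [0.2, 0.9, 0.5]
    center = (max(T) + min(T)) / 2
    print(chebyshev_radius_1d(T), worst_distance(center, T))
```

Here the interval length used in (1) is max(T) - min(T), which is exactly twice the radius; any center other than the midpoint gives a larger worst-case distance.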

An attractive feature of this way of measuring the size of Q is that it corresponds to the best way to "fill in" the diagram

        U
    X -----> Z

with a mapping A given only that $I\varphi = If + e$, $\|e\| \le 1$ and $f \in K$. These ideas are discussed in [4] (the proofs of Lemmas 1 and 2 are based on remarks in [4]).

We  quote  the  following  result  from  [4;  p 2] . 

Theorem B. If U is a linear operator from X to Z and K is a convex set centered about the origin $\theta \in X$ then

$g(I;f) \le 2 g(I;\theta).$

With this result we may proceed, as in Section 2, to show that the accuracy of a sequential search cannot be better than 1/2 that of a deterministic one. A sequential search in this context means that some set L of linear functionals on X is prescribed (in Section 2 this was the set of point evaluations). Then a sequential method based on n observations from L is determined by a function $h = (h_1, \ldots, h_n)$, $h_i : X \times Y_1 \times \cdots \times Y_{i-1} \to L$, where $h_1 = I_1$. The function selects the information

$h(I)f = (I_1 f,\ h_2(f, I_1 f),\ \ldots,\ h_n(f, I_1 f, \ldots, I_{n-1} f)) = (I_1 f, I_2 f, \ldots, I_n f)$

and a sequential search can guarantee an error of

$s_n = s_n(L) = \inf_h \sup_{f \in K} g(h(I)f; f)$

while a deterministic approach yields

$d_n = d_n(L) = \inf_{I_i \in L} \sup_{f \in K} g(I_1, \ldots, I_n; f).$


Theorem 3. Let the hypothesis of Theorem B be satisfied. Then

$\frac{1}{2} d_n \le s_n \le d_n.$


Proof. The right hand inequality is obvious. To prove the remaining inequality we let h be any sequential procedure. For $\varepsilon > 0$, there exist $I_1^*, \ldots, I_n^* \in L$ such that

$\inf_{I_i \in L} g(I_1, \ldots, I_n; \theta) + \varepsilon \ge g(I_1^*, \ldots, I_n^*; \theta)$  ($\theta$ = center of K).

Thus from Theorem B

$\sup_{f \in K} g(h(I)f; f) \ge g(h(I)\theta; \theta) \ge g(I_1^*, \ldots, I_n^*; \theta) - \varepsilon \ge \frac{1}{2} \sup_{f \in K} g(I_1^*, \ldots, I_n^*; f) - \varepsilon.$

Hence, letting $\varepsilon \to 0^+$ we see that no sequential procedure can achieve an accuracy less than $\frac{1}{2} d_n$. This proves the theorem.

When K is a unit ball given by a Hilbert space semi-norm and the data I is known exactly then (with no further assumptions on Y, Z) the factor 2 in Theorem B can be removed [5; p. 10]. For inaccurate data a similar result holds when Y and Z are Hilbert spaces [4]. In each of these cases $d_n = s_n$.

It is interesting to note that when the set L is chosen to be the set of all continuous linear functionals then $d_n$ corresponds to the Gel'fand n-width of the set $UK = \{Uf : f \in K\}$, since


I 